AITopics | feature norm

Collaborating Authors

feature norm

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Prediction is not Explanation: Revisiting the Explanatory Capacity of Mapping Embeddings

Herasimchyk, Hanna, Abdelhalim, Alhassan, Laue, Sören, Regneri, Michaela

arXiv.org Artificial IntelligenceAug-20-2025

Understanding what knowledge is implicitly encoded in deep learning models is essential for improving the interpretability of AI systems. This paper examines common methods to explain the knowledge encoded in word embeddings, which are core elements of large language models (LLMs). These methods typically involve mapping embeddings onto collections of human-interpretable semantic features, known as feature norms. Prior work assumes that accurately predicting these semantic features from the word embeddings implies that the embeddings contain the corresponding knowledge. We challenge this assumption by demonstrating that prediction accuracy alone does not reliably indicate genuine feature-based interpretability. We show that these methods can successfully predict even random information, concluding that the results are predominantly determined by an algorithmic upper bound rather than meaningful semantic representation in the word embeddings. Consequently, comparisons between datasets based solely on prediction performance do not reliably indicate which dataset is better captured by the word embeddings. Our analysis illustrates that such mappings primarily reflect geometric similarity within vector spaces rather than indicating the genuine emergence of semantic properties.

information overlap, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2508.13729

Country:

Europe (0.67)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Add feedback

PLoP: Precise LoRA Placement for Efficient Finetuning of Large Models

Hayou, Soufiane, Ghosh, Nikhil, Yu, Bin

arXiv.org Machine LearningJun-26-2025

Low-Rank Adaptation (LoRA) is a widely used finetuning method for large models. Its small memory footprint allows practitioners to adapt large models to specific tasks at a fraction of the cost of full finetuning. Different modifications have been proposed to enhance its efficiency by, for example, setting the learning rate, the rank, and the initialization. Another improvement axis is adapter placement strategy: when using LoRA, practitioners usually pick module types to adapt with LoRA, such as Query and Key modules. Few works have studied the problem of adapter placement, with nonconclusive results: original LoRA paper suggested placing adapters in attention modules, while other works suggested placing them in the MLP modules. Through an intuitive theoretical analysis, we introduce PLoP (Precise LoRA Placement), a lightweight method that allows automatic identification of module types where LoRA adapters should be placed, given a pretrained model and a finetuning task. We demonstrate that PLoP consistently outperforms, and in the worst case competes, with commonly used placement strategies through comprehensive experiments on supervised finetuning and reinforcement learning for reasoning.

large language model, machine learning, natural language, (21 more...)

arXiv.org Machine Learning

2506.20629

Country:

Asia > Thailand > Bangkok > Bangkok (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Mahalanobis++: Improving OOD Detection via Feature Normalization

Mueller, Maximilian, Hein, Matthias

arXiv.org Artificial IntelligenceMay-26-2025

Detecting out-of-distribution (OOD) examples is an important task for deploying reliable machine learning models in safety-critial applications. While post-hoc methods based on the Mahalanobis distance applied to pre-logit features are among the most effective for ImageNet-scale OOD detection, their performance varies significantly across models. We connect this inconsistency to strong variations in feature norms, indicating severe violations of the Gaussian assumption underlying the Mahalanobis distance estimation. We show that simple $\ell_2$-normalization of the features mitigates this problem effectively, aligning better with the premise of normally distributed data with shared covariance matrix. Extensive experiments on 44 models across diverse architectures and pretraining schemes show that $\ell_2$-normalization improves the conventional Mahalanobis distance-based approaches significantly and consistently, and outperforms other recently proposed OOD detection methods.

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2505.18032

Country: Europe (0.27)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
(2 more...)

Add feedback

AI-enhanced semantic feature norms for 786 concepts

Suresh, Siddharth, Mukherjee, Kushin, Giallanza, Tyler, Yu, Xizheng, Patil, Mia, Cohen, Jonathan D., Rogers, Timothy T.

arXiv.org Artificial IntelligenceMay-19-2025

Semantic feature norms have been foundational in the study of human conceptual knowledge, yet traditional methods face trade-offs between concept/feature coverage and verifiability of quality due to the labor-intensive nature of norming studies. Here, we introduce a novel approach that augments a dataset of human-generated feature norms with responses from large language models (LLMs) while verifying the quality of norms against reliable human judgments. We find that our AI-enhanced feature norm dataset, NOVA: Norms Optimized Via AI, shows much higher feature density and overlap among concepts while outperforming a comparable human-only norm dataset and word-embedding models in predicting people's semantic similarity judgments. Taken together, we demonstrate that human conceptual knowledge is richer than captured in previous norm datasets and show that, with proper validation, LLMs can serve as powerful tools for cognitive science research.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.10718

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.94)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Add feedback

Local and Global Feature Attention Fusion Network for Face Recognition

Yu, Wang, Wei, Wei

arXiv.org Artificial IntelligenceDec-5-2024

Recognition of low-quality face images remains a challenge due to invisible or deformation in partial facial regions. For low-quality images dominated by missing partial facial regions, local region similarity contributes more to face recognition (FR). Conversely, in cases dominated by local face deformation, excessive attention to local regions may lead to misjudgments, while global features exhibit better robustness. However, most of the existing FR methods neglect the bias in feature quality of low-quality images introduced by different factors. To address this issue, we propose a Local and Global Feature Attention Fusion (LGAF) network based on feature quality. The network adaptively allocates attention between local and global features according to feature quality and obtains more discriminative and high-quality face features through local and global information complementarity. In addition, to effectively obtain fine-grained information at various scales and increase the separability of facial features in high-dimensional space, we introduce a Multi-Head Multi-Scale Local Feature Extraction (MHMS) module. Experimental results demonstrate that the LGAF achieves the best average performance on $4$ validation sets (CFP-FP, CPLFW, AgeDB, and CALFW), and the performance on TinyFace and SCFace outperforms the state-of-the-art methods (SoTA).

artificial intelligence, feature norm, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2411.16169

Country:

Oceania > Australia > Western Australia > Perth (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > China > Hubei Province > Wuhan (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Effective Subset Selection Through The Lens of Neural Network Pruning

Bar, Noga, Giryes, Raja

arXiv.org Artificial IntelligenceJun-3-2024

However, the annotation task can be very expensive in some domains, such as medical data. Thus, it is important to select the data to be annotated wisely, which is known as the subset selection problem. We investigate the relationship between subset selection and neural network pruning, which is more widely studied, and establish a correspondence between them. Leveraging insights from network pruning, we propose utilizing the norm criterion of neural network features to improve subset selection methods. We empirically validate our proposed strategy on various networks and datasets, demonstrating enhanced accuracy. This shows the potential of employing pruning tools for subset selection.

feature norm, randomization, selection, (14 more...)

arXiv.org Artificial Intelligence

2406.01086

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.46)
Information Technology (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Exploring Simple, High Quality Out-of-Distribution Detection with L2 Normalization

Haas, Jarrod, Yolland, William, Rabus, Bernhard

arXiv.org Artificial IntelligenceJan-31-2024

We demonstrate that L2 normalization over feature space can produce capable performance for Out-of-Distribution (OoD) detection for some models and datasets. Although it does not demonstrate outright state-of-the-art performance, this method is notable for its extreme simplicity: it requires only two addition lines of code, and does not need specialized loss functions, image augmentations, outlier exposure or extra parameter tuning. We also observe that training may be more efficient for some datasets and architectures. Notably, only 60 epochs with ResNet18 on CIFAR10 (or 100 epochs with ResNet50) can produce performance within two percentage points (AUROC) of several state-of-the-art methods for some near and far OoD datasets. We provide theoretical and empirical support for this method, and demonstrate viability across five architectures and three In-Distribution (ID) datasets.

feature norm, machine learning research, normalization, (14 more...)

arXiv.org Artificial Intelligence

2306.04072

Genre: Research Report (0.85)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Add feedback

Feature Norm Regularized Federated Learning: Transforming Skewed Distributions into Global Insights

Hu, Ke, Qiu, WeiDong, Tang, Peng

arXiv.org Artificial IntelligenceDec-11-2023

In the field of federated learning, addressing non-independent and identically distributed (non-i.i.d.) data remains a quintessential challenge for improving global model performance. This work introduces the Feature Norm Regularized Federated Learning (FNR-FL) algorithm, which uniquely incorporates class average feature norms to enhance model accuracy and convergence in non-i.i.d. scenarios. Our comprehensive analysis reveals that FNR-FL not only accelerates convergence but also significantly surpasses other contemporary federated learning algorithms in test accuracy, particularly under feature distribution skew scenarios. The novel modular design of FNR-FL facilitates seamless integration with existing federated learning frameworks, reinforcing its adaptability and potential for widespread application. We substantiate our claims through rigorous empirical evaluations, demonstrating FNR-FL's exceptional performance across various skewed data distributions. Relative to FedAvg, FNR-FL exhibits a substantial 66.24\% improvement in accuracy and a significant 11.40\% reduction in training time, underscoring its enhanced effectiveness and efficiency.

algorithm, distribution skew, participant, (13 more...)

arXiv.org Artificial Intelligence

2312.06951

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Virginia (0.04)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

E-Sparse: Boosting the Large Language Model Inference through Entropy-based N:M Sparsity

Li, Yun, Niu, Lin, Zhang, Xipeng, Liu, Kai, Zhu, Jianchen, Kang, Zhanhui

arXiv.org Artificial IntelligenceOct-24-2023

Traditional pruning methods are known to be challenging to work in Large Language Models (LLMs) for Generative AI because of their unaffordable training process and large computational demands. For the first time, we introduce the information entropy of hidden state features into a pruning metric design, namely E-Sparse, to improve the accuracy of N:M sparsity on LLM. E-Sparse employs the information richness to leverage the channel importance, and further incorporates several novel techniques to put it into effect: (1) it introduces information entropy to enhance the significance of parameter weights and input feature norms as a novel pruning metric, and performs N:M sparsity without modifying the remaining weights. (2) it designs global naive shuffle and local block shuffle to quickly optimize the information distribution and adequately cope with the impact of N:M sparsity on LLMs' accuracy. E-Sparse is implemented as a Sparse-GEMM on FasterTransformer and runs on NVIDIA Ampere GPUs. Extensive experiments on the LLaMA family and OPT models show that E-Sparse can significantly speed up the model inference over the dense model (up to 1.53X) and obtain significant memory saving (up to 43.52%), with acceptable accuracy loss.

arxiv preprint arxiv, e-sparse, sparsity, (11 more...)

arXiv.org Artificial Intelligence

2310.15929

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.46)

Industry: Information Technology (0.36)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

A Method for Studying Semantic Construal in Grammatical Constructions with Interpretable Contextual Embedding Spaces

Chronis, Gabriella, Mahowald, Kyle, Erk, Katrin

arXiv.org Artificial IntelligenceMay-29-2023

We study semantic construal in grammatical constructions using large language models. First, we project contextual word embeddings into three interpretable semantic spaces, each defined by a different set of psycholinguistic feature norms. We validate these interpretable spaces and then use them to automatically derive semantic characterizations of lexical items in two grammatical constructions: nouns in subject or object position within the same sentence, and the AANN construction (e.g., `a beautiful three days'). We show that a word in subject position is interpreted as more agentive than the very same word in object position, and that the nouns in the AANN construction are interpreted as more measurement-like than when in the canonical alternation. Our method can probe the distributional meaning of syntactic constructions at a templatic level, abstracted away from specific lexemes.

computational linguistic, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2305.18598

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York (0.04)
Europe > Croatia > Dubrovnik-Neretva County > Dubrovnik (0.04)
(14 more...)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback